SIEVE IN DISCRETE GROUPS, ESPECIALLY SPARSE 

EMMANUEL KOWALSKI 



Abstract. We survey the recent applications and developments of sieve methods re- 
lated to discrete groups, especially in the case of infinite index subgroups of arithmetic 
groups. 
1. Introduction

Sieve methods appeared in number theory as a tool to try to understand the additive
properties of prime numbers, and then evolved over the 20th century into very sophisticated
tools. Not only did they provide extremely strong results concerning the problems
most directly relevant to their origin (such as Goldbach's conjecture, the Twin Primes
conjecture, or the problem of the existence of infinitely many primes of the form $n^2 + 1$),
but they also became tools of crucial importance in the solution of many problems which
were not so obviously related (examples are the first proof of the Erdős-Kac theorem,
and more recently sieve appeared in the progress, and solution, of the Quantum Unique
Equidistribution conjecture of Rudnick and Sarnak).

It is only quite recently that sieve methods have been applied to new problems, often
obviously related to the historical roots of sieve, which involve complicated infinite
discrete groups (of exponential growth) as basic substrate instead of the usual integers.
Moreover, both "small" and "large" sieves turn out to be applicable in this context to a
wide variety of very appealing questions, some of which are rather surprising. We will
attempt to present this story in this survey, following the mini-course at the "Thin groups
and super-strong-approximation" workshop. The basic outline is the following: in Section 2
we present a sieve framework that is general enough to describe both the classical
examples and those involving discrete groups; in Section 4 we show how to implement a
sieve, with emphasis on "small" sieves. In Section 5, we take up the "large" sieve, which
we discuss in a fair amount of detail since it is only briefly mentioned in [26] and has
the potential to be a very useful general tool even outside of number-theoretic contexts.
Finally, we conclude with a sampling of problems and further questions in Section 6.
We include a result which has not appeared before (to the author's knowledge), namely
a version of the Erdős-Kac Theorem in the context of affine sieve (Theorem 4.10), which
follows easily from the method of Granville and Soundararajan [17].

Apart from this, the writing will follow fairly closely the notes for the course at MSRI,
and in particular there will be relatively few details and no attempts at the greatest
known generality. The final section had no parallel in the actual lectures, for reasons of
time. More information can be gathered from the author's Bourbaki lecture [26], or from
Salehi-Golsefidy's paper in these Proceedings [16], and of course from the original papers.
Overall, we have tried to emphasize general principles and some specific applications,
rather than to repeat the more comprehensive survey of known results found in [26].

Notation. We recall here some basic notation. 



Key words and phrases. Expander graphs, Cayley graphs, sieve methods, prime numbers, thin sets, 
random walks on groups, large sieve. 
- The letter $p$ will always refer to a prime number; for a prime $p$, we write $\mathbf{F}_p$ for the
finite field $\mathbf{Z}/p\mathbf{Z}$. For a set $X$, $|X|$ is its cardinality, a non-negative integer or $+\infty$.

- The Landau and Vinogradov notations $f = O(g)$ and $f \ll g$ are synonymous, and
$f(x) = O(g(x))$ for all $x \in D$ means that there exists an "implied" constant $C \geq 0$
(which may be a function of other parameters) such that $|f(x)| \leq C g(x)$ for all $x \in D$.
This definition differs from that of N. Bourbaki [H Chap. V] since the latter is of
topological nature. We write $f \asymp g$ if $f \ll g$ and $g \ll f$. On the other hand, the notations
$f(x) \sim g(x)$ and $f = o(g)$ are used with the asymptotic meaning of loc. cit.

Reference. As a general reference on sieve methods, the best book available today is
the masterful work of Friedlander and Iwaniec [H]. Concerning the large sieve, the author's
book [22] contains very general results. We also recommend Sarnak's lectures on the affine
sieve [19]. Another survey of sieve in discrete groups, with a particular emphasis on small
sieves, is the Bourbaki seminar of the author [26], and Salehi-Golsefidy's paper [16] in
these Proceedings gives an account of the most general version of the affine sieve, due to
him and Sarnak [1?].

2. The setting for sieve in discrete groups 

Sieve methods attempt to obtain estimates on the size of sets constructed using
local-global and inclusion-exclusion principles. We start by describing a fairly general
framework for this type of question, tailored to applications to discrete groups (there are also
other settings of great interest, e.g., concerning the distribution of Frobenius conjugacy
classes related to families of algebraic varieties over finite fields, see [22, Ch. 8]).

We will consider a group $\Gamma$, viewed as a discrete group, which will usually be finitely
generated, and which is given either as a subgroup $\Gamma \subset \mathrm{GL}_r(\mathbf{Z})$ for some $r \geq 1$, or more
generally is given with a homomorphism

\[ \phi : \Gamma \to \mathrm{GL}_r(\mathbf{Z}), \]

which may not be injective (and of course is typically not surjective). Here are three
examples.

Example 2.1. (1) We can take $\Gamma = \mathbf{Z}$, embedded in $\mathrm{GL}_2(\mathbf{Z})$ for instance, using the map

\[ n \mapsto \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}. \]

This case is of course the most classical.

(2) Consider a finite symmetric set $S \subset \mathrm{SL}_r(\mathbf{Z})$, and let $\Gamma = \langle S \rangle \subset \mathrm{GL}_r(\mathbf{Z})$. Of
particular interest for us is the case when $\Gamma$ is "large" in the sense that it is
Zariski-dense in $\mathrm{SL}_r$. Recall that this means that there exist no polynomial relations among all
elements $g \in \Gamma$ except for those which are consequences of the equation $\det(g) = 1$. A
concrete example is as follows: for $k \geq 1$, let

\[ S_k = \left\{ \begin{pmatrix} 1 & \pm k \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ \pm k & 1 \end{pmatrix} \right\} \]

and let $\Gamma^{(k)}$ be the subgroup of $\mathrm{SL}_2(\mathbf{Z})$ generated by $S_k$. It is well-known that for $k \geq 1$,
this is a Zariski-dense subgroup of $\mathrm{SL}_2$.

We are especially interested in situations where $\Gamma$ is nevertheless "small", in the sense
that the index of $\Gamma$ in the arithmetic group $\mathrm{SL}_r(\mathbf{Z})$ is infinite. We will call this the sparse
case (though the terminology thin is also commonly used, we will wish to speak later of
thin subsets of $\mathrm{SL}_r$, as defined by Serre, and $\Gamma$ is not thin in this sense).

In the example above, the groups $\Gamma^{(1)} = \mathrm{SL}_2(\mathbf{Z})$ and $\Gamma^{(2)}$ are of finite index in $\mathrm{SL}_2(\mathbf{Z})$
(the latter is the kernel of the reduction map modulo 2), but $\Gamma^{(k)}$ is sparse for all $k \geq 3$.
In particular, the subgroup $\Gamma^{(3)}$ is sometimes known as the Lubotzky group.

(3) Here is an example where the group $\Gamma$ is not given as a subgroup of a linear group:
for an integer $g \geq 1$, let $\Gamma$ be the mapping class group of a closed surface $\Sigma_g$ of genus $g$,
and let

\[ \phi : \Gamma \to \mathrm{Sp}_{2g}(\mathbf{Z}) \subset \mathrm{GL}_{2g}(\mathbf{Z}) \]

be the map giving the action of $\Gamma$ on the first homology group $H_1(\Sigma_g, \mathbf{Z}) \simeq \mathbf{Z}^{2g}$, which
is symplectic with respect to the intersection pairing on $H_1(\Sigma_g, \mathbf{Z})$. Here it is known (for
instance, through the use of specific generators of $\Gamma$ mapping to elementary matrices in
$\mathrm{Sp}_{2g}(\mathbf{Z})$) that $\phi$ is surjective. (All facts on mapping class groups that we will use are
fairly elementary and are contained in the book of Farb and Margalit.)

The next piece of data are surjective maps

\[ \pi_p : \Gamma \to \Gamma_p \]

where $p$ runs over prime numbers (or possibly over a subset of them) and the $\Gamma_p$ are finite
groups. We view each such map as giving "local" information at the prime $p$, typically
by reduction modulo $p$. Indeed, in all cases in this text, the homomorphism $\pi_p$ is the
composition

\[ \Gamma \xrightarrow{\ \phi\ } \mathrm{GL}_r(\mathbf{Z}) \to \mathrm{GL}_r(\mathbf{F}_p) \]

of $\phi$ with the reduction map of matrices modulo $p$, and $\Gamma_p$ is defined as the image of this
map.

Example 2.2. (1) For $\Gamma = \mathbf{Z}$, reduction modulo $p$ is surjective onto $\Gamma_p = \mathbf{Z}/p\mathbf{Z}$ for all
primes.

(2) If $\Gamma$ is Zariski-dense in $\mathrm{SL}_r$, and we use reduction modulo $p$ to define $\pi_p$, it is a
consequence of general strong approximation statements that there exists a finite set of
primes $T(\Gamma)$ such that $\pi_p$ has image equal to $\mathrm{SL}_r(\mathbf{F}_p)$ for all $p \notin T(\Gamma)$, and in particular
for all primes large enough.¹ For instance, in the case of the subgroups $\Gamma^{(k)} \subset \mathrm{SL}_2(\mathbf{Z})$,
this property is visibly valid with

\[ T(\Gamma^{(k)}) = \{\text{primes } p \text{ dividing } k\}. \]

We refer to the survey [13] by Rapinchuk in these Proceedings for a general account of
Strong Approximation.

(3) For the mapping class group $\Gamma$ of $\Sigma_g$, and $\phi$ given by the action on homology, the
image of reduction modulo $p$ is equal to $\mathrm{Sp}_{2g}(\mathbf{F}_p)$ for all primes $p$ (simply because $\phi$ is
onto, and $\mathrm{Sp}_{2g}(\mathbf{Z})$ surjects to $\mathrm{Sp}_{2g}(\mathbf{F}_p)$ for all $p$).
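The statement of Example 2.2 (2) for the groups $\Gamma^{(k)}$ can be checked by brute force for small primes. The following is a minimal sketch, assuming that $S_k$ consists of the four elementary matrices with off-diagonal entry $\pm k$; the function names are hypothetical and chosen only for this illustration. It enumerates the subgroup of $\mathrm{SL}_2(\mathbf{F}_p)$ generated by the reduction of $S_k$ and compares its order with $|\mathrm{SL}_2(\mathbf{F}_p)| = p(p^2-1)$.

```python
def generated_subgroup_order(k, p):
    """Order of the subgroup of SL_2(F_p) generated by the reduction of S_k mod p."""
    def mul(x, y):
        # 2x2 matrices are stored as tuples (a, b, c, d) for [[a, b], [c, d]]
        a, b, c, d = x
        e, f, g, h = y
        return ((a * e + b * g) % p, (a * f + b * h) % p,
                (c * e + d * g) % p, (c * f + d * h) % p)

    # Assumed form of S_k: elementary matrices with off-diagonal entry +k or -k.
    gens = [(1, k % p, 0, 1), (1, (-k) % p, 0, 1),
            (1, 0, k % p, 1), (1, 0, (-k) % p, 1)]
    identity = (1, 0, 0, 1)
    seen = {identity}
    frontier = [identity]
    while frontier:  # breadth-first search over the Cayley graph
        new_frontier = []
        for x in frontier:
            for s in gens:
                y = mul(x, s)
                if y not in seen:
                    seen.add(y)
                    new_frontier.append(y)
        frontier = new_frontier
    return len(seen)


def order_sl2(p):
    """Order of SL_2(F_p), namely p (p^2 - 1)."""
    return p * (p * p - 1)


if __name__ == "__main__":
    for k, p in [(3, 5), (3, 7), (3, 3), (6, 2), (6, 5)]:
        print(k, p, generated_subgroup_order(k, p), order_sl2(p))
```

When $p$ divides $k$ the generators reduce to the identity and the image is trivial, while for $p \nmid k$ the search recovers the full group, in line with $T(\Gamma^{(k)})$ being the set of primes dividing $k$.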

We want to combine the maps $\pi_p$, corresponding to local information, modulo many
primes in order to get "global" results. This clearly only makes sense if using more than
a single prime leads to an increase of information. Intuitively, this is the case when the
reduction maps $\pi_p$, $\pi_q$, associated to distinct primes $p$ and $q$ are independent: knowing
the reduction modulo $p$ of an element of $\Gamma$ should give no information concerning the
reduction modulo $q$. We therefore make the following assumption on the data:



¹ This is directly related to the fact that $\mathrm{SL}_r$ is, as a linear algebraic group, connected and simply
connected.



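Assumption (Independence). There exists a finite set of primes $T_1(\Gamma)$, sometimes called
the $\Gamma$-exceptional primes, such that for any finite set $I$ of primes $p \notin T_1(\Gamma)$, the
simultaneous reduction map

\[ \pi_I : \Gamma \to \prod_{p \in I} \Gamma_p \]

modulo primes in $I$ is onto.
We will write

\[ \Gamma_I = \prod_{p \in I} \Gamma_p, \qquad q_I = \prod_{p \in I} p. \]

Note that $q_I$ is a squarefree integer, coprime with $T_1(\Gamma)$.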

Example 2.3. (1) For $\Gamma = \mathbf{Z}$, the Chinese Remainder Theorem shows that for any finite
set of primes $I$, we have

\[ \prod_{p \in I} \mathbf{Z}/p\mathbf{Z} \simeq \mathbf{Z}/q_I\mathbf{Z}, \]

and hence the map $\pi_I$ above can be identified with reduction modulo $q_I$. In particular,
it is surjective, so that the assumption holds with an empty set of exceptional primes.

(2) If $\Gamma \subset \mathrm{GL}_r(\mathbf{Z})$ has Zariski closure $\mathrm{SL}_r$, then the Assumption holds for the same set
of primes $T_1(\Gamma) = T(\Gamma)$ such that $\pi_p$ is surjective onto $\mathrm{SL}_r(\mathbf{F}_p)$ for $p \notin T(\Gamma)$, simply for
group-theoretic reasons: any subgroup of a finite product

\[ \prod_{p \in I} \mathrm{SL}_r(\mathbf{F}_p) \]

which surjects to each factor $\mathrm{SL}_r(\mathbf{F}_p)$ is equal to the whole product (this type of result is
known as Goursat's Lemma, see, e.g., [6, Prop. 5.1] or as Hall's Lemma [3, Lemma 3.7]).
Again a similar property holds if the Zariski closure of $\Gamma$ is an almost simple, connected,
simply-connected algebraic group.

(3) In particular, the Independence Assumption holds with $T_1(\Gamma) = \emptyset$ for the mapping
class group of $\Sigma_g$ acting on the homology of the surface, because Goursat's Lemma applies
to the finite groups $\mathrm{Sp}_{2g}(\mathbf{F}_p)$.

(4) The Independence Assumption may fail, for instance in the context of orthogonal
groups, when there is a global invariant which can be read off any reduction. The simplest
example of such an invariant is the determinant: if $\Gamma \subset \mathrm{GL}_r(\mathbf{Z})$ is not contained in $\mathrm{SL}_r(\mathbf{Z})$,
the compatibility condition

\[ \det(\pi_p(g)) = \det(g) \in \{\pm 1\} \subset \mathbf{F}_p^{\times} \]

valid for all $p$ and $g \in \mathrm{GL}_r(\mathbf{Z})$ shows that the image of $\pi_I$ is always contained in the
proper subgroup

\[ \{ (g_p) \in \Gamma_I \mid \det(g_p) = \det(g_q) \text{ for all } p, q \in I \} \]

(identifying all copies of $\{\pm 1\}$). This issue appears, concretely, in the example of the
Apollonian group and Apollonian circle packings, since the latter is a subgroup of an indefinite
orthogonal group intersecting both cosets of the special orthogonal group, see [10^ JJJ for
a precise analysis of this case.

It should be emphasized that this failure of the Independence Assumption is not
dramatic: one can replace $\Gamma$ by $\Gamma \cap \mathrm{SL}_r$ for instance, or by the other coset of the determinant
(with some adaptation since this is not a group).
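Both the Independence Assumption and its failure through the determinant can be observed on a small case. The sketch below is only an illustration under stated assumptions: it uses the standard elementary generators of $\mathrm{SL}_2(\mathbf{Z})$ and, for $\mathrm{GL}_2(\mathbf{Z})$, the extra generator $\mathrm{diag}(-1, 1)$ (choices made here, not taken from the text), and it computes the order of the image under simultaneous reduction modulo the primes in $I$.

```python
def image_order(gens, primes):
    """Order of the image of the group generated by gens under reduction modulo each prime."""
    def reduce_all(g):
        # one reduced copy (a, b, c, d) mod p for each prime p
        return tuple(tuple(x % p for x in g) for p in primes)

    def mul(x, y):
        out = []
        for (a, b, c, d), (e, f, g, h), p in zip(x, y, primes):
            out.append(((a * e + b * g) % p, (a * f + b * h) % p,
                        (c * e + d * g) % p, (c * f + d * h) % p))
        return tuple(out)

    start = reduce_all((1, 0, 0, 1))
    images = [reduce_all(g) for g in gens]
    seen = {start}
    frontier = [start]
    while frontier:  # breadth-first search in the finite product group
        new_frontier = []
        for x in frontier:
            for s in images:
                y = mul(x, s)
                if y not in seen:
                    seen.add(y)
                    new_frontier.append(y)
        frontier = new_frontier
    return len(seen)


# Assumed generating sets, matrices stored as (a, b, c, d) for [[a, b], [c, d]].
SL2_GENS = [(1, 1, 0, 1), (1, -1, 0, 1), (1, 0, 1, 1), (1, 0, -1, 1)]
GL2_GENS = SL2_GENS + [(-1, 0, 0, 1)]

if __name__ == "__main__":
    for name, gens in [("SL_2(Z)", SL2_GENS), ("GL_2(Z)", GL2_GENS)]:
        o3, o5 = image_order(gens, [3]), image_order(gens, [5])
        joint = image_order(gens, [3, 5])
        print(name, o3, o5, joint, "independent" if joint == o3 * o5 else "not independent")
```

For $\mathrm{SL}_2(\mathbf{Z})$ the joint image has the order of the full product, whereas for $\mathrm{GL}_2(\mathbf{Z})$ it is smaller by a factor 2, which is the determinant compatibility described in (4).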
We can now define the sifted sets $\mathcal{S} \subset \Gamma$ constructed by inclusion-exclusion using local
information: given a set $\mathcal{P}$ of primes (usually finite), and subsets

\[ \Omega_p \subset \Gamma_p \]

for $p \in \mathcal{P}$, we let

\[ \mathcal{S} = \mathcal{S}(\mathcal{P}; \Omega) = \{ g \in \Gamma \mid \pi_p(g) \notin \Omega_p \text{ for all } p \in \mathcal{P} \} = \bigcap_{p \in \mathcal{P}} \bigl( \Gamma - \pi_p^{-1}(\Omega_p) \bigr). \]

We want to know something about the size, or maybe more ambitiously the structure,
of such sifted sets. In fact, quite often, we wish to study sets which are not exactly of
this shape, but are closely related.

Example 2.4. (1) Let $\Gamma = \mathbf{Z}$, and let $\Omega_p = \{0, -2\} \subset \mathbf{F}_p$ for all primes $p \leq Q$, where
$Q \geq 2$ is some parameter. Then we have by definition

\[ \mathcal{S}(Q) = \{ n \in \mathbf{Z} \mid \text{neither } n \text{ nor } n + 2 \text{ has a prime factor } \leq Q \}. \]

In particular, for $N \geq 1$, the initial segment $\mathcal{S}(Q) \cap \{1, \ldots, N\}$ contains all "twin primes" $n$
between $Q$ and $N$, i.e., all primes $p$ with $Q < p \leq N$ such that $p + 2$ is also prime. Hence
an upper bound on the size of this initial segment will be an upper bound for the number
of twin primes in this range. This is valid independently of the value of $Q$. Furthermore,
if $Q \geq \sqrt{N + 2}$, we have in fact equality: an integer $n \in \mathcal{S}(\sqrt{N + 2}) \cap \{1, \ldots, N\}$ must
be prime, as well as $n + 2$, since both integers only have prime factors larger than their
square root. More generally, if $Q = N^{\beta}$ for some $\beta > 0$, we see that $\mathcal{S}(Q) \cap \{1, \ldots, N\}$
contains only integers $n$ such that both $n$ and $n + 2$ have less than $1/\beta$ prime factors.
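The twin-prime sifted set can be computed directly for small parameters. The following sketch (function names are hypothetical, and the naive primality test is only meant for small inputs) builds the initial segment of the sifted set with $\Omega_p = \{0, -2\}$ and compares it with the list of twin primes in the range $Q < p \leq N$.

```python
from math import isqrt


def is_prime(n):
    """Naive primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


def sifted_segment(N, Q):
    """Integers 1 <= n <= N such that n mod p is neither 0 nor -2 for every prime p <= Q."""
    primes = [p for p in range(2, Q + 1) if is_prime(p)]
    return [n for n in range(1, N + 1)
            if all(n % p != 0 and (n + 2) % p != 0 for p in primes)]


def twin_primes(lo, hi):
    """Primes p with lo < p <= hi such that p + 2 is also prime."""
    return [p for p in range(lo + 1, hi + 1) if is_prime(p) and is_prime(p + 2)]


if __name__ == "__main__":
    N = 100
    Q = 11  # Q >= sqrt(N + 2), so the sifted segment is exactly the twin primes in (Q, N]
    print(sifted_segment(N, Q))
    print(twin_primes(Q, N))
    # For a smaller Q the sifted segment only contains the twin primes in (Q, N].
    print(set(twin_primes(5, N)) <= set(sifted_segment(N, 5)))
```

With $N = 100$ and $Q = 11$ the two lists coincide, while for $Q = 5$ one only gets the inclusion, which is what makes the sifted set useful for upper bounds.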

(2) The first example is the prototypical example showing how sieve methods are used
to study prime patterns of various types. Bourgain, Gamburd and Sarnak [3] extended
this type of question to discrete subgroups of $\mathrm{GL}_r(\mathbf{Z})$. We present here a special case
of what is called the affine linear sieve or the sieve in orbits. There will be a few other
examples below, and we refer to the original paper or to [26] for a more general approach.

We assume for simplicity, as before, that $\Gamma$ is Zariski-dense in $\mathrm{SL}_r$. Let

\[ f : \mathrm{SL}_r(\mathbf{Z}) \to \mathbf{Z} \]

be a non-constant polynomial function, for instance the product of the coordinates. We
want to study the multiplicative properties of the integers $f(g)$ when $g$ runs over $\Gamma$.
Consider

\[ \Omega_p = \{ g \in \Gamma_p \mid f(g) = 0 \ (\mathrm{mod}\ p) \} \subset \Gamma_p \subset \mathrm{SL}_r(\mathbf{F}_p), \tag{2.1} \]

for $p \leq Q$. Then $\mathcal{S}(Q; \Omega)$ is the set of $g \in \Gamma$ such that $f(g)$ has no prime factor $\leq Q$. In
particular, for any $A > 0$, the intersection

\[ \mathcal{S}(Q) \cap \{ g \in \Gamma \mid |f(g)| \leq Q^{A} \} \]

consists of elements where $f(g)$ has $< A$ prime factors. For instance, when $f$ is the
product of coordinates, this set contains elements $g \in \Gamma$ where all coordinates have less
than $A$ prime factors.

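(3) For our last example, consider the mapping class group $\Gamma$ of $\Sigma_g$. Let $\mathcal{H}_g$ be a
handlebody with boundary $\Sigma_g$. For a mapping class $\phi \in \Gamma$, we denote by $M_\phi$ the compact
3-manifold obtained by Heegaard splitting using $\mathcal{H}_g$ and $\phi$, i.e., it is the union of two
copies of $\mathcal{H}_g$ where the boundaries are identified using (a representative of) $\phi$ (see [7] for
more about this construction).

The image $J$ of $H_1(\mathcal{H}_g, \mathbf{Z}) \simeq \mathbf{Z}^{g}$ in $H_1(\Sigma_g, \mathbf{Z}) \simeq \mathbf{Z}^{2g}$ is a lagrangian subspace (i.e., a
subgroup of rank $g$ such that the intersection pairing is identically zero on $J$). We denote
by $J_p \subset \mathbf{F}_p^{2g}$ its reduction modulo $p$. It follows from algebraic topology that

\[ H_1(M_\phi, \mathbf{Z}) \simeq H_1(\Sigma_g, \mathbf{Z}) / \langle J, \phi \cdot J \rangle, \]

\[ H_1(M_\phi, \mathbf{F}_p) \simeq H_1(M_\phi, \mathbf{Z}) \otimes \mathbf{F}_p \simeq H_1(\Sigma_g, \mathbf{F}_p) / \langle J_p, \phi \cdot J_p \rangle. \]

Thus if we let

\[ \Omega_p = \{ \gamma \in \mathrm{Sp}_{2g}(\mathbf{F}_p) \mid \gamma \cdot J_p \cap J_p = 0 \} = \{ \gamma \in \mathrm{Sp}_{2g}(\mathbf{F}_p) \mid \langle J_p, \gamma \cdot J_p \rangle = \mathbf{F}_p^{2g} \}, \tag{2.2} \]

we see that any sifted set $\mathcal{S}(\mathcal{P}; \Omega)$ contains all mapping classes $\phi$ such that $M_\phi$ has first
rational Betti number positive.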

We will discuss this example further in Section 5. The reader who is not familiar with
sieve is however encouraged to try to find the answer to the following question: What is
the great difference that exists between this example and the previous ones?

3. Conditions for sieving 

Having defined sifted sets and seen that they contain information of great potential
interest, we want to say something about them. The basic question is "How large is a
sifted set $\mathcal{S}$?" In order to make this precise, some truncation of $\mathcal{S}$ is needed, since in
general this is (or is expected to be) an infinite set. In fact, we saw in the simplest
examples (e.g., twin primes) that this truncation (in that case, the consideration of an
initial segment of a sifted set) is crucially linked to deriving interesting information from
$\mathcal{S}$, as one usually needs to handle a truncation which is correlated with the size of the
primes in the set $\mathcal{P}$ defining the sieve conditions.

When sieving in the generality we consider, it is a striking fact that there are different
ways to truncate the sifted sets, or indeed to measure subsets of $\Gamma$ in general (although
those we describe below seem, ultimately, to be closely related). We will speak of
"counting methods" below to refer to these various truncation techniques.

Method 1. [Archimedean balls] Fix a norm $\| \cdot \|$ (or some other metric) on the ambient
Lie group $\mathrm{GL}_r(\mathbf{R})$ (for instance the operator norm as linear maps on euclidean space, but
other choices are possible) and consider

\[ \mathcal{S} \cap \{ g \in \Gamma \mid \|g\| \leq T \} \]

for some parameter $T \geq 1$. This is a finite set, and one can try to estimate (from above
or below, or both) its cardinality.

Example 3.1. Let $\Gamma$ be a Zariski-dense subgroup of $\mathrm{SL}_r$, and $f$ a non-constant polynomial
function on $\mathrm{SL}_r(\mathbf{Z})$. For some $d \geq 1$, we have

\[ |f(g)| \ll \|g\|^{d} \]

for all $g \in \Gamma$. Hence if we consider the sifted set (2.1) for $Q = T^{\beta}$, the elements in

\[ \mathcal{S}(Q) \cap \{ g \in \Gamma \mid \|g\| \leq T \} \]

are such that $f(g)$ has at most $d/\beta$ prime factors.
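The count of prime factors in Example 3.1 comes from comparing sizes. As a sketch, write $C$ for the implied constant in the bound on $|f(g)|$ (a symbol introduced here for the illustration) and let $p_1, \ldots, p_k$ be the prime factors of $f(g)$, counted with multiplicity; each of them exceeds $Q = T^{\beta}$ when $g$ is in the sifted set, so that:

```latex
\[
T^{\beta k} = Q^{k} < p_1 \cdots p_k = |f(g)| \le C \|g\|^{d} \le C\, T^{d}
\quad\Longrightarrow\quad
k < \frac{d}{\beta} + \frac{\log C}{\beta \log T}.
\]
```

The second term tends to $0$ as $T \to +\infty$, so the bound is essentially $d/\beta$ for large $T$.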

Counting in archimedean balls in subgroups of arithmetic groups, even without involv- 
ing sieve, is a delicate matter, especially in the sparse case, which involves deep ideas from 
spectral theory, harmonic analysis and ergodic theory. We refer to the book of Gorodnik 
and Nevo [H] for the case of arithmetic groups, and to Oh's surveys [38] and [39] for the 
sparse case. 
Method 2. [Combinatorial balls] Since the groups $\Gamma$ of interest are most often finitely
generated, and indeed sometimes given with a set of generators, one can replace the
archimedean metric of the first method with a combinatorial one. Thus if $S = S^{-1}$ is
a generating set of $\Gamma$, we denote by $\ell_S(g)$ the word-length metric on $\Gamma$ defined using $S$.
The sets

\[ \mathcal{S} \cap \{ g \in \Gamma \mid \ell_S(g) \leq T \}, \quad \text{or} \quad \mathcal{S} \cap \{ g \in \Gamma \mid \ell_S(g) = T \}, \]

are again finite, and one can attempt to estimate their size.

This method is particularly interesting when $S$ is a set of free generators of $\Gamma$ (and their
inverses), because one knows precisely the size of the balls for the combinatorial metric
in that case. And even if this is not the case, one can often find a subgroup of $\Gamma$ which is
free of rank $\geq 2$, and use this subgroup instead of the original $\Gamma$ (this technique is used
in [3]; in that case, the necessary free subgroup is found using the Tits Alternative, a
very specific case of which says that if $\Gamma$ is Zariski-dense in $\mathrm{SL}_r$, then it contains a free
subgroup of rank 2).

Method 3. [Random walks] Instead of trying to reduce to free groups using a
subgroup, one can replace $\Gamma$ by the free group $F(S)$ generated by $S$ and use the obvious
homomorphisms

\[ F(S) \to \Gamma \to \mathrm{GL}_r(\mathbf{Z}) \]

and

\[ F(S) \to \Gamma \to \Gamma_p \]

to define sieve problems and sifted sets. An alternative to this description is to use
the generating set $S$ and count elements in balls for the word-length metric $\ell_S$ with
multiplicity, the multiplicity being the number of representations of $g \in \Gamma$ by a word of
given (or bounded) length. This means one measures the size of a set $X \subset \Gamma$ truncated
to the sphere of radius $N \geq 1$ around the origin by its density

\[ \mu_N(X) = \frac{1}{|S|^{N}} \bigl| \{ (s_1, \ldots, s_N) \in S^{N} \mid s_1 \cdots s_N \in X \} \bigr| \]

and therefore one tries to measure the density $\mu_N(\mathcal{S})$ of the sifted set, as a way of
measuring its size within a given ball. If one wishes to measure balls instead of spheres, a
simple expedient is to replace $S$ by $S_1 = S \cup \{1\}$ (since the sphere of radius $N$ for $\ell_{S_1}$ is
the ball of radius $N$ for $\ell_S$).

It is often convenient to think of this in terms of a random walk: one assumes given,
on a probability space $\Omega$, a sequence $(\xi_n)$ of independent $S$-valued random variables and
one defines a random walk $(\gamma_n)$ on $\Gamma$ by

\[ \gamma_0 = 1, \qquad \gamma_{n+1} = \gamma_n \xi_{n+1} \quad \text{for } n \geq 0. \]

If all steps $\xi_n$ are uniformly distributed on $S$, it follows that

\[ \mu_N(X) = \mathbf{P}(\gamma_N \in X), \]

or in other words, the density $\mu_N$ is the probability distribution of the $N$-th step of this
random walk.
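The two descriptions of the density in Method 3 (counting words, or taking the law of the $N$-th step of the random walk) can be compared on a small case. This is a minimal sketch, assuming the generating set is the set $S_3$ of elementary matrices with off-diagonal entries $\pm 3$ (the Lubotzky group); the test set $X$ and all function names are hypothetical choices made for the illustration.

```python
from fractions import Fraction
from itertools import product

K = 3  # assumed choice: generators S_3 of the group Gamma^(3)
S = [(1, K, 0, 1), (1, -K, 0, 1), (1, 0, K, 1), (1, 0, -K, 1)]
IDENTITY = (1, 0, 0, 1)


def mul(x, y):
    """Product of 2x2 integer matrices stored as (a, b, c, d) for [[a, b], [c, d]]."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def density_by_words(N, in_X):
    """mu_N(X): proportion of words (s_1, ..., s_N) in S^N whose product lies in X."""
    hits = 0
    for word in product(S, repeat=N):
        g = IDENTITY
        for s in word:
            g = mul(g, s)
        if in_X(g):
            hits += 1
    return Fraction(hits, len(S) ** N)


def law_of_step(N):
    """Exact law of gamma_N, as a dictionary {group element: probability}."""
    law = {IDENTITY: Fraction(1)}
    for _ in range(N):
        new_law = {}
        for g, weight in law.items():
            for s in S:  # gamma_{n+1} = gamma_n * xi_{n+1}, with xi uniform on S
                h = mul(g, s)
                new_law[h] = new_law.get(h, Fraction(0)) + weight / len(S)
        law = new_law
    return law


def density_by_walk(N, in_X):
    """P(gamma_N in X), computed from the exact law of the N-th step."""
    return sum((w for g, w in law_of_step(N).items() if in_X(g)), Fraction(0))


if __name__ == "__main__":
    def in_X(g):
        return g[1] % 2 == 0  # hypothetical test set: upper-right entry even

    print(density_by_words(5, in_X), density_by_walk(5, in_X))
```

The second function avoids enumerating all $|S|^N$ words by propagating the distribution one step at a time, which is the point of view used in the rest of the section.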

Example 3.2. The analogue (for Methods 2 and 3) of the argument in Example 3.1 is
the following: given a function $f$ as in that example, there exists $C \geq 1$ such that, for all
$g \in \Gamma$, we have

\[ |f(g)| \leq C^{\ell_S(g)} \]

(simply because the operator norm of $g$ is submultiplicative and hence grows at most
exponentially with the word-length metric). Thus elements which have word-length at
most $N$ and belong to a sifted set $\mathcal{S}(Q; \Omega)$ with $Q$ of size $A^{N}$, for some $A > 1$, have at
most $(\log C)/(\log A)$ prime factors: indeed, if $f(g)$ has $k$ prime factors, all larger than
$A^{N}$, then $A^{Nk} < |f(g)| \leq C^{N}$.

Example 3.3 (Dunfield-Thurston random manifolds). This third counting method is
the least familiar to classical analytic number theory. This random walk approach was
however already considered by Dunfield-Thurston [7] as a way of studying random
3-manifolds, using the Heegaard-splitting construction based on mapping class groups as in
Example 2.4 (3): given an integer $g \geq 1$, they consider a finite generating set $S$ of the
mapping class group $\Gamma$ of $\Sigma_g$ and the associated random walk $(\phi_n)$. The 3-manifolds $M_{\phi_n}$
are then "random 3-manifolds" and some of their properties can be studied using sieve
methods.

It is of course useful to have a way of considering these three methods in parallel. This
can be done by assuming that one has a sequence $(\mu_N)$ of finite measures on $\Gamma$, and by
considering the problem of estimating $\mu_N(\mathcal{S})$, the measure of the sifted set. In Method
1, these measures would be the uniform counting measure on the intersection of $\Gamma$ with
the balls of radius $N$ in $\mathrm{GL}_r(\mathbf{R})$, in Method 2, the uniform counting measure on the
combinatorial ball of radius $N$, and in Method 3, the probability law of the $N$-th step of
the random walk.

4. Implementing sieve with expanders 

We will now explain how all this relates to expanders. The one-line summary is that 
the expander condition will allow us to apply classical results of sieve theory to settings 
of discrete groups "with exponential growth" (one might prefer to say, "in non-amenable 
settings"). We can motivate this convincingly as follows. 

The simplest possible sieve problem occurs when the set $\mathcal{P}$ of conditions is restricted
to a single prime, and one is asking for

\[ \mu_N(\{ g \in \Gamma \mid \pi_p(g) = g_0 \}) \]

for a fixed prime $p$ and a fixed $g_0 \in \Gamma_p$. One sees that, assuming $p$ is fixed, this
elementary-looking question concerns the distribution of the image sequence $\pi_{p,*}\mu_N$ of measures
on the finite group $\Gamma_p$. This may well be expected to have a good answer.

Example 4.1. Consider (one last time) the classical case $\Gamma = \mathbf{Z}$. If we truncate by
considering initial segments $\{1, \ldots, N\}$, we are asking here about the number of positive
integers $\leq N$ congruent to a given $a$ modulo $p$. The proportion of these converges of
course to $1/p$, and this is usually so self-evident that one never mentions it specifically.
(But, still in classical cases, note that if one starts the sieve from the set of primes instead
of $\mathbf{Z}$, then this basic question is resolved by Dirichlet's Theorem on primes in arithmetic
progressions, and the uniformity in this question is basically the issue of the Generalized
Riemann Hypothesis.)

On intuitive grounds as well as theoretically, one can expect that the "probability" that
$g$ reduces modulo $p$ to $g_0$ should be about $1/|\Gamma_p|$. This amounts to expecting that the
probability measures $\pi_{p,*}(\mu_N)/\mu_N(\Gamma)$ converge weakly to the uniform (Haar) probability
measure on this finite group. It is when considering uniformity of such convergence that
expander graphs enter the picture.

We can already deduce from this intuition the following heuristic concerning the size
of a sifted set $\mathcal{S}(\mathcal{P}; \Omega)$: each condition $\pi_p(g) \notin \Omega_p$ should hold with "probability"
approximately

\[ 1 - \frac{|\Omega_p|}{|\Gamma_p|} \]

and these sieving conditions, for distinct primes, should be independent. Hence one may
expect that

\[ \mu_N(\mathcal{S}(\mathcal{P}; \Omega)) \approx \mu_N(\Gamma) \prod_{p \in \mathcal{P}} \Bigl( 1 - \frac{|\Omega_p|}{|\Gamma_p|} \Bigr) \tag{4.1} \]

(where the symbol $\approx$ here only means that the right-hand side is a first guess for the
left-hand side...)

The simplest counting method to explain this is Method 3, where the argument is
very transparent. We therefore assume in the remainder of this section that $\mu_N$ is the
probability law of the $N$-th step of a random walk on $\Gamma$ as above.

It is then an immediate corollary of the theory of finite Markov chains (applied to
the random walk on the Cayley graph of $\Gamma_p$ induced by that on $\Gamma$) that, if $1 \in S$ (or
more generally if this Cayley graph is not bipartite, i.e., if there exists no surjective
homomorphism $\Gamma_p \to \{\pm 1\}$ such that each generator $s \in S$ maps to $-1$), we have
exponentially-fast convergence to the probability Haar measure. Precisely, let $M_p$ be the
Markov operator acting on functions on $\Gamma_p$ by

\[ M_p \varphi(x) = \frac{1}{|S|} \sum_{s \in S} \varphi(xs). \]

This operator also acts on functions of mean 0, i.e., on the space $L^2_0(\Gamma_p)$ of functions
such that

\[ \sum_{x \in \Gamma_p} \varphi(x) = 0, \]

and has real eigenvalues. Let $\varrho_p < 1$ be its spectral radius on this space (it is $< 1$ because
the eigenvalue 1 is removed by restricting to $L^2_0$, while $-1$ is not an eigenvalue because the
graph is not bipartite). We then have

\[ \Bigl| \mu_N(\pi_p(g) = g_0) - \frac{1}{|\Gamma_p|} \Bigr| \leq \varrho_p^{N} \]

for all $N \geq 1$.
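The convergence to the uniform measure can be observed numerically. The sketch below is an illustration under stated assumptions: it takes the generators $S_k$ (elementary matrices with off-diagonal entries $\pm k$) and a prime $p \geq 5$ not dividing $k$, so that the image is all of $\mathrm{SL}_2(\mathbf{F}_p)$ and the Cayley graph is not bipartite, and it propagates the exact law of $\pi_p(\gamma_n)$ in floating point. The function name is hypothetical.

```python
def sup_deviations(p, k, steps):
    """Sup-norm distance between the law of pi_p(gamma_n) and uniform, for n = 1..steps."""
    assert k % p != 0, "p must not divide k, so that the image is all of SL_2(F_p)"

    def mul(x, y):
        a, b, c, d = x
        e, f, g, h = y
        return ((a * e + b * g) % p, (a * f + b * h) % p,
                (c * e + d * g) % p, (c * f + d * h) % p)

    gens = [(1, k % p, 0, 1), (1, (-k) % p, 0, 1),
            (1, 0, k % p, 1), (1, 0, (-k) % p, 1)]
    size = p * (p * p - 1)  # order of SL_2(F_p)
    uniform = 1.0 / size
    law = {(1, 0, 0, 1): 1.0}  # the walk starts at the identity
    deviations = []
    for _ in range(steps):
        new_law = {}
        for x, w in law.items():
            for s in gens:
                y = mul(x, s)
                new_law[y] = new_law.get(y, 0.0) + w / len(gens)
        law = new_law
        dev = max(abs(w - uniform) for w in law.values())
        if len(law) < size:  # elements not yet reached have probability 0
            dev = max(dev, uniform)
        deviations.append(dev)
    return deviations


if __name__ == "__main__":
    devs = sup_deviations(5, 3, 40)
    for n in (1, 5, 10, 20, 40):
        print(n, devs[n - 1])
```

The printed deviations decrease with $n$, and the ratio of successive values gives a rough numerical indication of the spectral radius $\varrho_p$ for this particular choice of $p$ and generators.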

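More generally, under the Independence Assumption, if $I$ is a finite set of primes not
in $T_1(\Gamma)$, the same argument applied to the quotient

\[ \Gamma \to \Gamma_I = \prod_{p \in I} \Gamma_p \]

shows that for any $(g_p) \in \Gamma_I$, we have

\[ \Bigl| \mu_N(\pi_p(g) = g_p \text{ for } p \in I) - \prod_{p \in I} \frac{1}{|\Gamma_p|} \Bigr| \leq \varrho_I^{N} \tag{4.2} \]

where $\varrho_I < 1$ is the corresponding spectral radius for $\Gamma_I$. It follows by summing over
$x = (g_p) \in \Gamma_I$ that we have a quantitative equidistribution

\[ \int_{\Gamma} \varphi\bigl( (\pi_p(g))_{p \in I} \bigr) \, d\mu_N(g) = \frac{1}{|\Gamma_I|} \sum_{x \in \Gamma_I} \varphi(x) + O\bigl( |\Gamma_I| \, \|\varphi\|_{\infty} \, \varrho_I^{N} \bigr) \tag{4.3} \]

(with an absolute implied constant) for any function $\varphi$ on $\Gamma_I$.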



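In particular, we see that if $\mathcal{P}$ is a fixed set of primes (not in $T_1(\Gamma)$), then as $N \to +\infty$,
the basic heuristic (4.1) is valid asymptotically:

\[ \lim_{N \to +\infty} \mu_N(\mathcal{S}(\mathcal{P}; \Omega)) = \lim_{N \to +\infty} \mathbf{P}(\gamma_N \in \mathcal{S}(\mathcal{P}; \Omega)) = \prod_{p \in \mathcal{P}} \Bigl( 1 - \frac{|\Omega_p|}{|\Gamma_p|} \Bigr) \tag{4.4} \]

(we will call this a "bounded sieve" statement).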
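In the classical case $\Gamma = \mathbf{Z}$, counted by initial segments rather than by a random walk, the bounded sieve statement can be verified exactly: the sifted set is periodic modulo $q = \prod_{p \in \mathcal{P}} p$, so its limiting density is its proportion in one period. The sketch below (with hypothetical function names, and the twin-prime choice $\Omega_p = \{0, -2\}$) compares that proportion with the heuristic product.

```python
from fractions import Fraction
from math import prod


def twin_omega(p):
    """Omega_p = {0, -2} in Z/pZ (a single class when p = 2)."""
    return {0, (-2) % p}


def bounded_sieve_density(primes, omega):
    """Exact proportion of n modulo q = prod(primes) with n mod p outside omega(p) for all p."""
    q = prod(primes)
    count = sum(1 for n in range(q)
                if all(n % p not in omega(p) for p in primes))
    return Fraction(count, q)


def heuristic_product(primes, omega):
    """Product over p of (1 - |Omega_p| / |Gamma_p|), with |Gamma_p| = p for Gamma = Z."""
    return prod((1 - Fraction(len(omega(p)), p) for p in primes), start=Fraction(1))


if __name__ == "__main__":
    primes = [2, 3, 5, 7]
    print(bounded_sieve_density(primes, twin_omega), heuristic_product(primes, twin_omega))
```

Here the two quantities agree exactly, by the Chinese Remainder Theorem; the difficulty discussed next only appears when the set of primes grows with $N$.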

The difficulty (and fun!) of sieve methods is that the sifted sets of most interest are such
that the primes involved in $\mathcal{P}$ are not fixed as $N \to +\infty$: they are in ranges increasing
with the size of the elements being considered (as shown already by the example of
the twin primes). It is clear that in order to handle such sifted sets, we need a uniform
control of the equidistribution properties modulo primes, and modulo finite sets of primes
simultaneously. The best we can hope for is that (4.2) holds with the spectral radius
bounded away from one independently of $I$. This is, of course, exactly the condition
under which the family of Cayley graphs of $\Gamma_I$ with respect to the generators $S$ is a
family of (absolute) expander graphs.

Remark 4.2. We have discussed the example of the random walk counting method. It is
a fact that analogues of (4.2) hold in all cases where sieve methods have been successfully
applied. Moreover, these analogues hold uniformly with respect to $I$, and ultimately, the
source is always equivalent to the expansion property of the Cayley graphs, although the
proofs and the equivalence might be much more involved than the transparent argument
that exists in the case of random walks.

Example 4.3. The first case beyond the classical examples (or the case of arithmetic
groups, where Property (T) or ($\tau$) can be used,² although this also had not been done
before) where sieve in discrete groups was implemented is due to Bourgain, Gamburd
and Sarnak [3], who (based on earlier work of Helfgott [T9] and Bourgain-Gamburd [2])
proved that if $\Gamma$ is a finitely-generated Zariski-dense subgroup of $\mathrm{SL}_2(\mathbf{Z})$ (or even of
$\mathrm{SL}_2(\mathcal{O})$, where $\mathcal{O}$ is the ring of integers in a number field), the Cayley graphs of $\Gamma_I$, where
$I$ runs over finite sets of primes not in $T_1(\Gamma)$, form a family of (absolute) expanders. The problem of
generalizing this to $\mathrm{SL}_r$, or to Zariski-dense subgroups of other algebraic groups, was one
of the motivations for the recent developments of this result, and of the basic "growth"
theorem of Helfgott, to more general groups. We now know an essentially best possible
result (see [HI El] , and the surveys [IHl El |12] of Salehi-Golsefidy, Breuillard and
Pyber-Szabó in these Proceedings for introductions to this area):

Theorem 4.4 (Salehi-Golsefidy-Varjú). Let $\Gamma \subset \mathrm{GL}_r(\mathbf{Z})$ be finitely generated by $S =
S^{-1}$, with Zariski-closure $G$. For $p$ prime, let $\Gamma_p$ be the image of $\Gamma$ under reduction
modulo $p$, and for a finite set of primes $I$, let $\Gamma_I$ be the image of $\Gamma$ in

\[ \prod_{p \in I} \Gamma_p \]

under the simultaneous reduction homomorphism.

If the connected component of the identity in $G$ is a perfect group, then there exists a
finite set of primes $T_1(\Gamma)$ such that the family of Cayley graphs of $\Gamma_I$, for $I \cap T_1(\Gamma) = \emptyset$,
is an expander family.

We can now describe what is the implication of some classical sieve results in the 
context of discrete groups. We assume formally the following: 



² See the works of Gorodnik and Nevo for the best known results in this direction.
Assumption (Expansion). There exists a finite set of primes $T_2(\Gamma)$ such that $\Gamma$ satisfies
the Independence Assumption for primes not in $T_2(\Gamma)$, and furthermore the family of
Cayley graphs of $\Gamma_I$, for $I \cap T_2(\Gamma) = \emptyset$, is an expander family, i.e., there exists $\varrho < 1$ such
that for any finite set $I$ of primes $p \notin T_2(\Gamma)$, the spectral radius for the Markov operator
on $\Gamma_I$ satisfies

\[ \varrho_I \leq \varrho. \]

By (4.2), this assumption implies that the asymptotic formula

\[ \mathbf{P}(\pi_p(\gamma_n) = g_p \text{ for } p \in I) \sim \prod_{p \in I} \frac{1}{|\Gamma_p|} \tag{4.5} \]

holds uniformly for $n \geq 1$ and sets $I$ such that $|\Gamma_I| \leq \tilde{\varrho}^{\,-n}$ for any $\tilde{\varrho} > \varrho$. If we assume
that

\[ |\Gamma_p| \leq p^{B} \tag{4.6} \]

for some fixed $B \geq 1$, this means that we can control simultaneously and uniformly all
reductions of the $n$-th step as long as $q_I \leq \tilde{\varrho}^{\,-n/B}$. Note that (4.6) is not very restrictive:
it holds (with $B = r^2$) if $\pi_p$ is just the reduction modulo $p$ on $\mathrm{GL}_r(\mathbf{Z})$, which is the case
in all our applications.
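The passage from (4.2) to (4.5) and to the range of $q_I$ can be spelled out as follows. This is a sketch under the assumptions just stated, where $\varrho$ is the uniform bound on the spectral radii, $\tilde{\varrho}$ is any fixed number with $\varrho < \tilde{\varrho} < 1$ (the notation $\tilde{\varrho}$ is chosen here for the illustration), and $|\Gamma_I| = \prod_{p \in I} |\Gamma_p|$ by the Independence Assumption:

```latex
% Relative error in (4.5), obtained by multiplying (4.2) by |\Gamma_I|:
\[
\Bigl|\, |\Gamma_I| \cdot \mathbf{P}\bigl(\pi_p(\gamma_n) = g_p \text{ for } p \in I\bigr) - 1 \Bigr|
\;\le\; |\Gamma_I|\, \varrho_I^{\,n}
\;\le\; \tilde{\varrho}^{\,-n} \varrho^{\,n}
\;=\; (\varrho/\tilde{\varrho})^{n} \longrightarrow 0
\qquad \text{if } |\Gamma_I| \le \tilde{\varrho}^{\,-n},
\]
% Under (4.6), the condition on |\Gamma_I| follows from a condition on q_I:
\[
q_I \le \tilde{\varrho}^{\,-n/B}
\;\Longrightarrow\;
|\Gamma_I| = \prod_{p \in I} |\Gamma_p| \le \prod_{p \in I} p^{B} = q_I^{B} \le \tilde{\varrho}^{\,-n}.
\]
```

In particular the moduli $q_I$ are allowed to grow exponentially with $n$, which is what makes sieving with primes up to $A^{n}$ possible.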

The most classical types of sieve are those when the sieving conditions determined by
$\Omega_p$ hold with probability approximately $\kappa/p$, at least on average, where $\kappa$ is a fixed real
number traditionally called the dimension of the sieve. Precisely, we say that $(\Omega_p)$ is of
dimension $\kappa$ if we have

\[ \sum_{p \leq X} \frac{|\Omega_p|}{|\Gamma_p|} = \kappa \log\log X + O(1) \tag{4.7} \]

for $X \geq 2$. Note that this is certainly true, for instance, if

\[ \frac{|\Omega_p|}{|\Gamma_p|} = \frac{\kappa}{p} + O\Bigl( \frac{1}{p^{1+\delta}} \Bigr) \]

for some $\delta > 0$ and all $p$ prime.
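The dimension condition (4.7) can be tested numerically in the classical twin-prime case, where $\Gamma_p = \mathbf{Z}/p\mathbf{Z}$, $|\Omega_p| = 2$ for odd $p$ and $|\Omega_2| = 1$, so that the expected dimension is $\kappa = 2$. The following sketch (hypothetical function names, a basic sieve of Eratosthenes) computes the difference between the two sides of (4.7) for a few values of $X$.

```python
from math import log


def primes_up_to(x):
    """List of primes <= x (x >= 2), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, x + 1, i)))
    return [i for i in range(2, x + 1) if sieve[i]]


def dimension_defect(X, omega_size, kappa):
    """Sum over p <= X of |Omega_p| / p, minus kappa * log log X (here |Gamma_p| = p)."""
    total = sum(omega_size(p) / p for p in primes_up_to(X))
    return total - kappa * log(log(X))


def twin_omega_size(p):
    """|Omega_p| for Omega_p = {0, -2}: one class modulo 2, two classes otherwise."""
    return 1 if p == 2 else 2


if __name__ == "__main__":
    for X in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
        print(X, dimension_defect(X, twin_omega_size, 2))
```

The printed differences stay bounded (they stabilize quickly), which is the content of the $O(1)$ term in (4.7) for this example.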

We then have the following basic result: 

Theorem 4.5 (Small sieve in discrete groups). Let $\Gamma$ be a discrete group finitely generated
by $S = S^{-1}$, given with $\phi : \Gamma \to \mathrm{GL}_r(\mathbf{Z})$ and surjective homomorphisms $\pi_p$ to finite
groups $\Gamma_p$ as above, in particular with (4.6) for some fixed $B \geq 1$. Assume that $\Gamma$ satisfies
the Independence and the Expansion assumptions. Let $(\gamma_n)$ denote a random walk on $\Gamma$
using steps from $S$, and let $\mu_n$ denote the probability law of the $n$-th step.
Let $\Omega_p \subset \Gamma_p$ be finite sets such that (4.7) holds for some $\kappa > 0$.

There exists $A > 1$ such that, for all $n \geq 1$, if we let $Q = A^{n}$ and take $\mathcal{P}$ to be the set
of primes $p \leq Q$ with $p \notin T_2(\Gamma)$, then we have

\[ \frac{1}{n^{\kappa}} \ll \mathbf{P}(\gamma_n \in \mathcal{S}(\mathcal{P}; \Omega)) \ll \frac{1}{n^{\kappa}} \]

for all $n$ large enough.



This is essentially a direct consequence of the standard Brun-type sieve, building on
the Independence and Expansion assumptions. The mechanism is explained in [26], and
to avoid repetition, we will not give further details here. We simply add a few remarks.
First, this result confirms the heuristic (4.1) as far as the order of magnitude is concerned,
i.e., up to multiplicative constants. Indeed, the right-hand side of (4.1) is, in this case,
given by

$\prod_{p \le A^n,\ p \notin T_2(\Gamma)} \Bigl(1 - \frac{|\Omega_p|}{|\Gamma_p|}\Bigr) \asymp \frac{1}{n^{\kappa}}$

as $n \to +\infty$, by (4.7) and the Mertens Formula (or the Prime Number Theorem). Secondly,
the result is best possible in the sense that one cannot replace the inequalities up
to multiplicative constants by an asymptotic formula in this generality (this is also seen
from the Mertens Formula and the Prime Number Theorem). Finally, the result is by no
means an easy consequence of (4.2) and the uniformity afforded by expansion.
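The order of magnitude in the first remark can be verified by a short computation. The sketch below writes $\varpi_p = |\Omega_p|/|\Gamma_p|$ (a shorthand used only here) and assumes, in addition to (4.7), that $\varpi_p \le 1 - \eta$ for some fixed $\eta > 0$ and that $\sum_p \varpi_p^2$ converges, as happens when $\varpi_p = \kappa/p + O(p^{-1-\delta})$. The finitely many excluded primes are ignored, since they only change the implied constants.

```latex
\prod_{p \le A^n} (1 - \varpi_p)
  = \exp\Bigl( \sum_{p \le A^n} \log(1 - \varpi_p) \Bigr)
  = \exp\Bigl( - \sum_{p \le A^n} \varpi_p + O\Bigl( \sum_{p} \varpi_p^2 \Bigr) \Bigr)
  \asymp \exp\bigl( - \kappa \log\log A^n \bigr)
  = (n \log A)^{-\kappa}
  \asymp n^{-\kappa}.
```

The second equality uses $\log(1-x) = -x + O(x^2)$ uniformly for $0 \le x \le 1 - \eta$, and the third step uses (4.7) with $X = A^n$ together with $\exp(O(1)) \asymp 1$.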

Example 4.6 (Sieve in orbits). We illustrate the above result by deriving, as a corollary,
a special case of the sieve in orbits (or affine linear sieve) of [3].

Let $\Gamma$ be Zariski-dense in $\mathrm{SL}_r$ with $r \ge 2$, and generated by the finite set $S = S^{-1}$. We
take for $\pi_p$ the reduction maps. Let

$f : \mathrm{SL}_r(\mathbf{Z}) \to \mathbf{Z}$

be a non-constant polynomial map and let $\Omega_p \subset \mathrm{SL}_r(\mathbf{F}_p)$ be the set of zeros of $f$. Since
$f$ is non-constant, the algebraic subvariety $Z_f$ of $\mathrm{SL}_r$ defined by the equation $f = 0$ is a
hypersurface in $\mathrm{SL}_r$, and consequently, by the Lang-Weil estimates, we have

(4.8)    $\frac{|\Omega_p|}{|\Gamma_p|} = \frac{\kappa_p}{p} + O(p^{-3/2})$

where $\kappa_p$ is the number of geometrically irreducible components of the reduction of $Z_f$
modulo $p$ and the implied constant depends only on $f$. A further application of the
Chebotarev density theorem shows that

$\sum_{p \le x} \frac{\kappa_p}{p} = \kappa \log\log x + O(1)$

where $\kappa$ is the number of irreducible components of $Z_f$ (over $\mathbf{Q}$; in the simplest cases, all
of these components are themselves defined over $\mathbf{Q}$, and then we have $\kappa_p = \kappa$ for all but
finitely many $p$).

Thus all assumptions of Theorem 4.5 hold (the Expansion assumption coming from
Theorem 4.4), and we deduce that there exists a finite set of primes $T$ and $A > 1$ such
that if $\mathcal{P}$ is the set of primes not in $T$ and $\le A^n$, we have

$\mathbf{P}(\gamma_n \in \mathcal{S}(\mathcal{P}; \Omega)) \asymp n^{-\kappa}$.

Using Example 3.2 we therefore deduce:

Theorem 4.7 (Sieve in orbits; Bourgain-Gamburd-Sarnak). Let $\Gamma$ and $f$ be as above.
There exists $\omega \ge 1$ such that the set $O_f(\omega)$ of all $g \in \Gamma$ such that $f(g)$ has at most $\omega$
prime factors satisfies

(4.9)    $\mathbf{P}(\gamma_n \in O_f(\omega)) \gg n^{-\kappa}$

for $n$ large enough.

One of the insights of Bourgain-Gamburd-Sarnak was that such a statement has a 
more qualitative corollary which is already very interesting and doesn't require any con- 
sideration of a special counting method: 

Corollary 4.8. Let $\Gamma$ and $f$ be as above. There exists $\omega \ge 1$ such that the set $O_f(\omega)$ is
Zariski-dense in $\mathrm{SL}_r$.

Proof. It is enough to check that if a subset $X \subset \Gamma$ is not Zariski-dense, then a lower
bound

$\mathbf{P}(\gamma_n \in X) \gg n^{-\kappa}$

does not hold for any $\kappa > 0$, since $O_f(\omega) \subset X$ would then contradict the sieve lower
bound (4.9) (note that here $(\gamma_n)$ is just an auxiliary tool).

Given $X$, there exists a non-trivial function $f$ such that $X \subset Z_f$. Then, for any prime
$p$ (large enough so that reduction of $f$ modulo $p$ makes sense) the image of $X$ modulo
$p$ is contained in the zeros of $f$ modulo $p$. But using (4.8) and summing the equidistribution
estimate over the zeros of $f$ modulo $p$, we have

$\mathbf{P}(\pi_p(\gamma_n) \in Z_f \ (\mathrm{mod}\ p)) \sim \kappa_p p^{-1}$

uniformly for $p \le A^n$ for some $A > 1$. Taking $p$ of size $A^n$, we deduce

$\mathbf{P}(\gamma_n \in X) \le \mathbf{P}(\pi_p(\gamma_n) \in Z_f \ (\mathrm{mod}\ p)) \ll A^{-n}$

for $n$ large enough. Thus the probability to be a zero of a given function $f$ is in fact
exponentially small for a long walk, and this contradicts the lower bounds for $O_f(\omega)$. □

In fact, as noted in [26] and as we will see in the next section, this has a natural
refinement where Zariski-dense is replaced by "not thin" in the sense of Serre.

Note that Salehi-Golsefidy and Sarnak [47] have extended the basic small sieve state-
ment to much more general groups, not necessarily reductive, using the full power of
Theorem 4.4 together with special considerations to handle unipotent groups.

Example 4.9. Theorem 4.5 also applies in the context of Dunfield-Thurston manifolds,
as in Example 3.3. Indeed, the Expansion Assumption is here a consequence of Property
(T) for $\mathrm{Sp}_{2g}(\mathbf{Z})$. As observed in [26], a consequence of Theorem 4.5 which is similar in
spirit to the affine linear sieve is that there exists $\omega \ge 1$ such that

$\mathbf{P}(H_1(M_{\phi_n}, \mathbf{Z}) \text{ is finite and has order divisible by} \le \omega \text{ primes}) \asymp \frac{1}{n}$

for $n$ large enough. (We recall that the genus $g$ defining the Heegaard splitting is fixed.)

One can certainly use the sieve setting for many other purposes. As one further ex-
ample, we show how the method of Granville and Soundararajan [17, Prop. 3] gives a
version of the Erdős-Kac Theorem for discrete groups. For simplicity, we only state the
result for the affine sieve, and give one further example afterwards.

Theorem 4.10 (Erdős-Kac central limit theorem for affine sieve). Assume that $\Gamma \subset
\mathrm{SL}_r(\mathbf{Z})$ is Zariski-dense in $\mathrm{SL}_r$, and $f$ is a non-constant polynomial function satisfying
the assumptions of Theorem 4.7. For a random walk $(\gamma_n)$ on $\Gamma$, let $\omega_f(\gamma_n) = \omega(f(\gamma_n))$ if
$f(\gamma_n) \ne 0$, and $\omega_f(\gamma_n) = 0$ otherwise. Then the random variables

$\frac{\omega_f(\gamma_n) - \kappa \log n}{\sqrt{\kappa \log n}}$

converge in law to the standard normal random variable as $n \to +\infty$.

Proof. We proceed exactly as in [17], leaving some details to the reader. This uses the
method of moments to prove convergence in law to the normal distribution: classical
probability results imply that it is enough to prove that for all integers $k \ge 0$, we have

$\mathbf{E}\Bigl(\Bigl(\frac{\omega_f(\gamma_n) - \kappa \log n}{\sqrt{\kappa \log n}}\Bigr)^k\Bigr) \to c_k$

as $n \to +\infty$, where $c_k = \mathbf{E}(N(0,1)^k)$ is the $k$-th moment of a standard normal random
variable.

We first deal with the possibility that $f(\gamma_n) = 0$. By bounding

$\mathbf{P}(f(\gamma_n) = 0) \le \mathbf{P}(f(\pi_p(\gamma_n)) = 0)$

for any prime $p$ large enough, and arguing as in the proof of Corollary 4.8, we get

$\mathbf{P}(f(\gamma_n) = 0) \ll c^{-n}$

for some $c > 1$. Thus the expectation above, restricted to the set $f(\gamma_n) = 0$, is

$\ll (\kappa \log n)^{k/2} c^{-n} \to 0$

as $n \to +\infty$.

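Below we use the notation $\tilde{\mathbf{E}}$ to denote expectation restricted to $f(\gamma_n) \ne 0$. We fix
some integer $k \ge 0$, and fix some auxiliary $A > 1$. We will compare

$M_k = \tilde{\mathbf{E}}\bigl((\omega_f(\gamma_n) - \kappa \log n)^k\bigr)$

with the moment of the "truncated" count of primes dividing $f(\gamma_n)$ defined by

$N_k = \tilde{\mathbf{E}}\Bigl(\Bigl(\sum_{p \le A^n,\ \pi_p(\gamma_n) \in \Omega_p} 1 - \sum_{p \le A^n} \varpi_p\Bigr)^k\Bigr)$

where

$\varpi_p = \frac{|\Omega_p|}{|\Gamma_p|}$,

and then estimate asymptotically this second moment when $A > 1$ is small enough with
respect to $k$.

For the first step, we note that when $f(\gamma_n) \ne 0$, we have

$\omega_f(\gamma_n) - \kappa \log n = \Delta_1 + \Delta_2 + \Delta_3$

where

$\Delta_1 = \sum_{p \le A^n,\ \pi_p(\gamma_n) \in \Omega_p} 1 - \sum_{p \le A^n} \varpi_p$,

$\Delta_2 = \omega(f(\gamma_n)) - \sum_{p \le A^n,\ p \mid f(\gamma_n)} 1$,

$\Delta_3 = \sum_{p \le A^n} \varpi_p - \kappa \log n$.

If $C > 1$ is such that $|f(g)| \le C^{\ell_S(g)}$, then we get

$0 \le \Delta_2 \le \frac{\log C}{\log A}$,

while, by (4.7), we have

$\Delta_3 = \sum_{p \le A^n} \varpi_p - \kappa \log n = \sum_{p \le A^n} \frac{|\Omega_p|}{|\Gamma_p|} - \kappa \log n = O(1)$,

so that $\Delta_2 + \Delta_3$ is uniformly bounded for a fixed choice of $A$. Using the multinomial
theorem, it follows that

$M_k = N_k + O\bigl(\max_{j < k} N_j'\bigr)$,

where

$N_j' = \tilde{\mathbf{E}}\Bigl(\Bigl|\sum_{p \le A^n,\ \pi_p(\gamma_n) \in \Omega_p} 1 - \sum_{p \le A^n} \varpi_p\Bigr|^j\Bigr)$.

We have $N_j' = N_j$ if $j$ is even and if $j$ is odd, we get

$N_j' \le \sqrt{N_{j-1} N_{j+1}}$

by the Cauchy-Schwarz inequality, showing that good understanding of $N_j$ for $j \le k$ will
suffice to estimate $M_k$.

For the second step, we write

$X_p = \mathbf{1}_{\{\pi_p(\gamma_n) \in \Omega_p\}} - \varpi_p$

for $p \le A^n$, sum over $p$, and open the $k$-th power defining $N_k$. Note that $|X_p| \le 1$.
Exchanging the multiple sum over primes and the expectation, we get

$N_k = \sum \cdots \sum_{p_1, \ldots, p_k \le A^n} \tilde{\mathbf{E}}\Bigl(\prod_{j=1}^{k} X_{p_j}\Bigr)$.

For any fixed $(p_1, \ldots, p_k)$, we note that

$\tilde{\mathbf{E}}\Bigl(\prod_{j=1}^{k} X_{p_j}\Bigr) = \mathbf{E}\Bigl(\prod_{j=1}^{k} X_{p_j}\Bigr) - \mathbf{E}\Bigl(\mathbf{1}_{\{f(\gamma_n) = 0\}} \prod_{j=1}^{k} X_{p_j}\Bigr)$

and the second term is bounded by $\mathbf{P}(f(\gamma_n) = 0) \ll c^{-n}$ since $|\prod_j X_{p_j}| \le 1$. Thus the
total change in replacing $\tilde{\mathbf{E}}$ by $\mathbf{E}$ in the formula above for $N_k$ is $\ll A^{kn} c^{-n}$, which is
negligible if $A$ is chosen small enough.

Having written

$N_k = \sum \cdots \sum_{p_1, \ldots, p_k \le A^n} \tilde{\mathbf{E}}\Bigl(\prod_{j=1}^{k} X_{p_j}\Bigr) = \sum \cdots \sum_{p_1, \ldots, p_k \le A^n} \mathbf{E}\Bigl(\prod_{j=1}^{k} X_{p_j}\Bigr) + O(A^{kn} c^{-n})$,

we can apply the equidistribution (4.3) to each expectation term, obtaining a main term
which we will discuss in a moment and a total error term $E$ which is bounded by a quantity
of the shape

$E \ll A^{k(B+1)n} \rho^n$ for some $\rho < 1$

(where $B$ is as in (4.6)). Therefore $E$ tends to $0$ as $n \to +\infty$ if $A$ is chosen small enough
(in terms of $k$), which we assume to be done.

There remains the main term. However, the latter is, by the Independence Assumption
and by retracing our steps, almost tautologically the same as

$\mathbf{E}\Bigl(\Bigl(\sum_{p \le A^n} (Y_p - \varpi_p)\Bigr)^k\Bigr)$

where the $(Y_p)$ are independent Bernoulli random variables with expectation $\varpi_p = |\Omega_p|/|\Gamma_p|$.
It is a basic probabilistic fact that the sum

$\sum_{p \le A^n} Y_p$

satisfies the Central Limit Theorem, with mean $\kappa \log n$ and variance $\kappa \log n$ (because
of (4.7) again). Therefore this sum has the right $k$-th moment for all $k \ge 0$, and this
easily concludes the proof (or see [17] for a direct analysis of this type of main terms to
see the combinatorics from which the normal moments explicitly appear). □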
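The mean and variance used in the last step of the proof can be made explicit. The following sketch writes $\varpi_p$ for the expectation $|\Omega_p|/|\Gamma_p|$ of the Bernoulli variable $Y_p$, and assumes (beyond (4.7)) that $\sum_p \varpi_p^2$ converges, which holds for instance under an estimate of the type (4.8).

```latex
\mathbf{E}\Bigl( \sum_{p \le A^n} Y_p \Bigr)
  = \sum_{p \le A^n} \varpi_p
  = \kappa \log\log A^n + O(1)
  = \kappa \log n + O(1),
\qquad
\mathbf{V}\Bigl( \sum_{p \le A^n} Y_p \Bigr)
  = \sum_{p \le A^n} \varpi_p (1 - \varpi_p)
  = \sum_{p \le A^n} \varpi_p - \sum_{p \le A^n} \varpi_p^2
  = \kappa \log n + O(1).
```

Both bounded error terms are negligible after dividing by $\sqrt{\kappa \log n}$, which is why the normalization in Theorem 4.10 uses $\kappa \log n$ for the centering and for the variance.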
Example 4.11 (Erdős-Kac theorem for random 3-manifolds). It is clear that the argu-
ment can be applied in greater generality (including for other counting methods, pro-
vided the analogue of quantitative and suitably uniform equidistribution is known). For
instance, one sees that, for Dunfield-Thurston random 3-manifolds, the number $\omega(M_{\phi_n})$
of primes $p$ such that $H_1(M_{\phi_n}, \mathbf{F}_p) \ne 0$ is such that

$\frac{\omega(M_{\phi_n}) - \log n}{\sqrt{\log n}}$

converges to a standard normal random variable, with the convention $\omega(M_{\phi_n}) = 0$ if
$H_1(M_{\phi_n}, \mathbf{Q}) \ne 0$.

5. The large sieve 

We begin with a motivating example. 

Example 5.1. Consider Corollary 4.8. Although the Zariski topology contains a fair
amount of information (see ^) for examples of distinction it makes concerning the sieve 
in orbits), it is not very arithmetic. By itself, the fact that $O_f(\omega)$ is Zariski-dense in $\mathrm{SL}_r$
does not exclude the possibility that this set is contained, for instance, in the subset $X$
of $\mathrm{SL}_r(\mathbf{Z})$ of matrices where the top-left coefficient is a perfect square (since $X$ is Zariski-
dense in $\mathrm{SL}_r$). It is natural to try to study this and similar possibilities. The following
definition is relevant (see [50, Chapter 3]):

Definition 5.2 (Thin set). A subset $X \subset \mathrm{SL}_r(\mathbf{Q})$ is thin if there exists an algebraic
variety $W/\mathbf{Q}$ with $\dim(W) \le r^2 - 1$ and a morphism $\pi : W \to \mathrm{SL}_r$ such that (1) $\pi$ has no
rational section; (2) we have $X \subset \pi(W(\mathbf{Q}))$.

Example 5.3. (1) The set $X = \{g \in \mathrm{SL}_r(\mathbf{Q}) \mid g_{1,1} \text{ is a square}\}$ is thin. Indeed, we have
a $\mathbf{Q}$-morphism

$\pi : \mathbf{A}^{r^2} \to \mathbf{A}^{r^2}$

mapping $(g_{i,j})$ to the matrix $(h_{i,j})$ with $h_{1,1} = g_{1,1}^2$ and all other coordinates unchanged,
and the pull-back of this morphism to $\mathrm{SL}_r \subset \mathbf{A}^{r^2}$ gives a morphism

$\pi : W \to \mathrm{SL}_r$

with $X \subset \pi(W(\mathbf{Q}))$ by construction (and $\dim W \le \dim \mathrm{SL}_r$ is clear since $\pi$ has finite
fibers).

(2) A subset $X$ which is not Zariski-dense is thin.

We wish to prove:

Proposition 5.4. Let $\Gamma$ and $f$ be as in Corollary 4.8. Then there exists $\omega \ge 1$ such that
$O_f(\omega)$ is not thin in $\mathrm{SL}_r$.

The natural idea to prove this is to prove that if $X$ is a thin set, then for a random
walk on $\Gamma$, the probability

$\mathbf{P}(\gamma_n \in X)$

is too small to be compatible with (4.9). For this, we observe, as in the proof of the
Zariski-density, that if $X \subset \pi(W(\mathbf{Q}))$ for some

$\pi : W \to \mathrm{SL}_r$

as in the definition, we have

$\pi_p(X) \subset \pi(W(\mathbf{F}_p))$,

for all primes $p$ large enough (such that $W$ and $\pi$ make sense modulo $p$). Hence if $g \in X$,
we have

$\pi_p(g) \notin \Omega_p = \mathrm{SL}_r(\mathbf{F}_p) - \pi(W(\mathbf{F}_p))$,

for all $p$ large enough. This implies the sieve inclusion

$X \subset \mathcal{S}(\mathcal{P}; \Omega)$,

where $\mathcal{P}$ contains all but finitely many primes.

However, the size of $\Omega_p$ is typically much larger than the number of points of an
algebraic variety, as one can guess by just looking at the example of squares in $\mathbf{Q}$, where
the image modulo $p$ contains roughly half of all residue classes. Indeed, in general we
have:

Lemma 5.5. Let $\pi : W \to \mathrm{SL}_r$ be a $\mathbf{Q}$-rational morphism with $\dim W \le r^2 - 1$ and
with no $\mathbf{Q}$-rational section. There exists $\delta < 1$ such that, for $p$ large enough, we have

$\frac{|\pi(W(\mathbf{F}_p))|}{|\mathrm{SL}_r(\mathbf{F}_p)|} \le \delta$.

For the proof, see e.g. [50, Th. 3.6.2].
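For the thin set of matrices whose top-left coefficient is a square, the density bounded by Lemma 5.5 can be computed by brute force for small primes. The following is a minimal sketch restricted to $r = 2$ (a simplifying assumption); the function name `square_corner_count` is hypothetical. It counts the matrices in $\mathrm{SL}_2(\mathbf{F}_p)$ whose top-left entry is a square modulo $p$, which is the image modulo $p$ of the thin set.

```python
from itertools import product


def square_corner_count(p):
    """Count matrices (a, b; c, d) in SL_2(F_p) whose entry a is a square mod p.

    Returns (hits, total), where total is the order of SL_2(F_p).
    Intended for small primes p only: the loop visits p**4 matrices.
    """
    squares = {(x * x) % p for x in range(p)}
    hits = total = 0
    for a, b, c, d in product(range(p), repeat=4):
        if (a * d - b * c) % p == 1:
            total += 1
            if a in squares:
                hits += 1
    return hits, total


if __name__ == "__main__":
    for p in (3, 5, 7):
        hits, total = square_corner_count(p)
        print(p, hits, total, hits / total)
```

For odd $p$ a direct count gives the proportion $(p+2)/(2(p+1))$, which tends to $1/2$; the complement $\Omega_p$ therefore contains a positive proportion of the group, as needed for a large sieve.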

Example 5.6 (Homology of Dunfield-Thurston random manifolds). We consider the
situation of Example 2.4 (3). Here we found sifting conditions $\Omega_p$ defined in (2.2) such
that, if $M_{\phi_n}$ denotes the manifold obtained from the $n$-th step of a random walk on the
mapping class group $\Gamma$ (as in Example 3.3), we have

$\mathbf{P}(H_1(M_{\phi_n}, \mathbf{Q}) \ne 0) \le \mathbf{P}(\phi_n \in \mathcal{S}(Q; \Omega))$

for any $Q \ge 1$, where $Q$ refers to using all primes $p \le Q$ as sieve conditions. It is an
interesting computation to show that

$\frac{|\Omega_p|}{|\Gamma_p|} = \prod_{j=1}^{g} \frac{1}{1 + p^{-j}}$

(see [7, Th. 8.4]) so that, for fixed $g$, there exists $\delta_g > 0$ for which

$\frac{|\Omega_p|}{|\Gamma_p|} \ge \delta_g$

for all $p$.



We now revert to the general setting of a discrete group $\Gamma$ with local information
$\pi_p : \Gamma \to \Gamma_p$. We have found above two natural instances of large sieves, a terminology
which refers to sieving problems where the sets $\Omega_p$ are "large", something which most
commonly means that they contain a positive proportion of $\Gamma_p$: for some $\delta > 0$, we have

(5.1)    $\frac{|\Omega_p|}{|\Gamma_p|} \ge \delta > 0$

for all $p \in \mathcal{P}$. This is to be compared with the "small" sieve assumption (4.7), and
this leads to an interesting remark (answering the question to the reader at the end of
Example 2.4 (3)): the primes occur explicitly on both sides of (4.7), but as far as the
left-hand side is concerned, they are just indices that could be replaced with any other
countable set. However, on the right-hand side, the actual size of primes (and hence their
distribution) is involved. This feature disappears in (5.1). This suggests that the "large"
sieve could be of interest in much wider contexts outside of number theory. This is indeed
the case, as was shown already partly in the book [25], and even more convincingly in the
recent works of Lubotzky, Meiri and Rosenzweig that we will discuss, some of which prove
general algebraic statements about linear groups using some forms of sieve methods.

To present the large sieve in the context of discrete groups, we will use here the very
simple version from the paper [30] of Lubotzky and Meiri, adapted to our setting.



Theorem 5.7 (Large sieve). Let $\Gamma$ be a group generated by a finite symmetric set $S$ with
$1 \in S$. Let $\pi_p : \Gamma \to \Gamma_p$ be surjective homomorphisms onto finite groups for $p \ge p_0$. Assume
that:

(1) For any $p \ne q$ primes $\ge p_0$, the induced homomorphisms

$\pi_{p,q} = \pi_p \times \pi_q : \Gamma \to \Gamma_p \times \Gamma_q = \Gamma_{p,q}$

are onto.

(2) The family of Cayley graphs of $\Gamma_{p,q}$ and $\Gamma_p$ with respect to $S$ is an expander family,
for $p, q \ge p_0$.

(3) For some $B \ge 1$ we have

$|\Gamma_p| \le p^B$.

Let $\Omega_p \subset \Gamma_p$ be such that

(5.2)    $\frac{|\Omega_p|}{|\Gamma_p|} \ge \delta$

for some $\delta > 0$ independent of $p$.

Then there exists $A > 1$ and $c > 1$ such that for $Q = A^n$, we have

$\mathbf{P}(\gamma_n \in \mathcal{S}(Q; \Omega)) \ll c^{-n}$

for $n$ large enough, where the sieving is done using primes $p_0 \le p \le Q$.

Remark 5.8. (1) Note how the assumptions concerning the group and the $\Gamma_p$ are slightly
weaker versions of those used for the small sieve in Theorem 4.5. Thus this version of
the large sieve applies whenever Theorem 4.5 is applicable. In particular, in view of the
example at the beginning of this section, we see that this theorem proves Proposition 5.4.
Similarly, for the Dunfield-Thurston random manifolds of Example 5.6, this shows that
the probability that $H_1(M_{\phi_n}, \mathbf{Q}) \ne 0$ goes to zero exponentially fast.

(2) It would be unreasonable to expect lower bounds for the size of sifted sets in
the large sieve situation, unless the set $\mathcal{P}$ determining the sieving conditions is extremely
small (so the situation essentially reverts to a bounded sieve (4.4)). Indeed, if we consider
integers and sieve by removing the non-square residue classes modulo $p$ for all $p \in \mathcal{P}$,
which is certainly a large sieve, the right-hand side of the heuristic size of the remaining
set is $(1/2)^{|\mathcal{P}|}$. If $\mathcal{P}$ is the set of primes $\le Q$, then this is much smaller than the number
of squares in $\{1, \ldots, N\}$, which certainly remain after the sieve, if $Q = N^{\varepsilon}$ for any fixed
$\varepsilon > 0$. (See [18] for a discussion of the fascinating question of the possibility of an
"inverse" large sieve statement for integers.)

We adapt the simple proof in [30] (due to R. Peled; it is reminiscent of some classical
arguments going back to Rényi and Turán, see [25, Prop. 2.15]).

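Proof. For a fixed $n$, let $X_p$ denote the Bernoulli random variable equal to $1$ when $\pi_p(\gamma_n) \in
\Omega_p$ and $0$ otherwise, and let

$X = \sum_{p_0 \le p \le Q} X_p$.

We see that $\gamma_n \in \mathcal{S}(Q; \Omega)$ is tautologically equivalent to the condition $X = 0$. We can
compute easily the expectation of $X$, namely

$\mathbf{E}(X) = \sum_{p} \mathbf{P}(\pi_p(\gamma_n) \in \Omega_p)$,

which, by Expansion for $(\Gamma_p)$, satisfies

$\mathbf{E}(X) = \sum_{p} \frac{|\Omega_p|}{|\Gamma_p|} + O(Q^{B+1} \rho^n)$,

where $\rho < 1$ is an upper bound for the spectral radius of the expansion of the Cayley
graphs. If $Q^{B+1} \rho^n \ll 1$, this gives

$\mathbf{E}(X) \gg \pi(Q) \gg \frac{Q}{\log Q}$

using the large sieve assumption on the size of $\Omega_p$.

We will now use the Chebychev inequality

$\mathbf{P}(\gamma_n \in \mathcal{S}(Q; \Omega)) = \mathbf{P}(X = 0) \le \frac{\mathbf{V}(X)}{\mathbf{E}(X)^2}$,

where $\mathbf{V}(X)$ is the variance of $X$. We compute

$\mathbf{V}(X) = \mathbf{E}((X - \mathbf{E}(X))^2) = \sum_{p, q} W(p, q)$

by expanding the square, where

$W(p, q) = \mathbf{E}(X_p X_q) - \mathbf{E}(X_p) \mathbf{E}(X_q)$
$= \mathbf{P}(\pi_p(\gamma_n) \in \Omega_p \text{ and } \pi_q(\gamma_n) \in \Omega_q) - \mathbf{P}(\pi_p(\gamma_n) \in \Omega_p) \mathbf{P}(\pi_q(\gamma_n) \in \Omega_q)$

(a measure of the correlation between two primes). We isolate the diagonal terms where
$p = q$, for which we use the trivial bound $|W(p, p)| \le 1$, and obtain

$\mathbf{V}(X) \le Q + Q^2 \max_{p \ne q} |W(p, q)|$.

Finally, to estimate $W(p, q)$ when $p \ne q$, we can apply the Independence and Expansion
Assumptions: we have

$\mathbf{P}(\pi_p(\gamma_n) \in \Omega_p \text{ and } \pi_q(\gamma_n) \in \Omega_q) = \frac{|\Omega_p| |\Omega_q|}{|\Gamma_p| |\Gamma_q|} + O(Q^{2B} \rho^n)$

while, by the same argument used for computing the expectation, we have

$\mathbf{P}(\pi_p(\gamma_n) \in \Omega_p) \mathbf{P}(\pi_q(\gamma_n) \in \Omega_q) = \frac{|\Omega_p| |\Omega_q|}{|\Gamma_p| |\Gamma_q|} + O(Q^{B} \rho^n)$.

The main terms cancel, and therefore

$Q^2 \max_{p \ne q} |W(p, q)| \ll Q^{2+2B} \rho^n$.

Take $Q$ as large as possible so that $Q^{2+2B} \rho^n < 1$, so that $Q \ge A^n$ for some $A > 1$.
Then the Chebychev inequality gives

$\mathbf{P}(\gamma_n \in \mathcal{S}(Q; \Omega)) \ll (Q^{-1} + Q^{2B} \rho^n)(\log Q)^2 \ll \frac{(\log Q)^2}{Q}$,

which is of exponential decay in terms of $n$. □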
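The Chebychev step of the proof can be illustrated in an idealized model. The following is a minimal sketch, assuming the indicator variables $X_p$ are exactly independent with prescribed success probabilities (in the proof, independence only holds up to the error terms controlled by expansion); the function name `zero_probability_bounds` is hypothetical. It compares the exact value of $\mathbf{P}(X = 0)$ with the bound $\mathbf{V}(X)/\mathbf{E}(X)^2$.

```python
def zero_probability_bounds(densities):
    """Compare P(X = 0) with the Chebychev bound V(X) / E(X)^2.

    X is a sum of independent Bernoulli variables whose success
    probabilities are given by `densities` (idealized model).
    Returns the pair (exact, chebychev).
    """
    mean = sum(densities)
    variance = sum(d * (1.0 - d) for d in densities)
    exact = 1.0
    for d in densities:
        exact *= 1.0 - d
    chebychev = variance / mean ** 2
    return exact, chebychev


if __name__ == "__main__":
    # With m sieving conditions of density 1/2, the Chebychev bound is 1/m.
    for m in (10, 100, 1000):
        print(m, zero_probability_bounds([0.5] * m))
```

In this model the Chebychev bound is only of size $1/m$ for $m$ conditions, far from the exact value $2^{-m}$; it still yields exponential decay in $n$ because the number of usable primes $p \le A^n$ is itself exponential in $n$.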
Remark 5.9. (1) Clearly, one can restrict the large sieve assumption (5.2) to a subset of
primes with positive natural density (e.g., some arithmetic progression) without changing 
the conclusion, and this is often useful. 

(2) This very simple proof is well-suited to situations where precise information on the 
expansion constant of the relevant Cayley graphs is missing (as is most often the case). 
When such information is available, one gets from this argument an explicit constant 
c > 1, and one may wish to get it as large as possible. For this, one can use rather more 
precise inequalities, as discussed extensively in [25] . 

The point of the large sieve is the fact that it gives a quantitative decay of the size of
the sifted set in balls. Indeed, if $X \subset \Gamma$ is such that

$X \subset \mathcal{S}(Q; \Omega)$

for all $Q$ large enough, and the $\Omega_p$ satisfy (5.2), then the bounded sieve (4.4) is enough
to imply that

(5.3)    $\lim_{n \to +\infty} \mathbf{P}(\gamma_n \in X) = 0$,

since, for each finite set $I$, we get

$\limsup_{n \to +\infty} \mathbf{P}(\gamma_n \in X) \le \lim_{n \to +\infty} \mathbf{P}(\pi_p(\gamma_n) \notin \Omega_p \text{ for } p \in I) = \prod_{p \in I} \Bigl(1 - \frac{|\Omega_p|}{|\Gamma_p|}\Bigr) \le (1 - \delta)^{|I|}$,

and letting $|I| \to +\infty$, we obtain (5.3). But this qualitative decay is not sufficient to
prove Proposition 5.4.

Lubotzky and Meiri introduce the following useful definition: 

Definition 5.10 (Exponentially small sets). Let $\Gamma$ be a finitely generated group. A subset
$X \subset \Gamma$ is exponentially small if, for any finite symmetric generating set $S$ containing $1$,
and with $(\gamma_n)$ the corresponding random walk on $\Gamma$, there exists a constant $c_S > 1$ such
that

$\mathbf{P}(\gamma_n \in X) \ll c_S^{-n}$

for $n \ge 1$.

Remark 5.11. Thus, we can summarize part of our previous discussion by stating that if $X$
is a thin subset of $\mathrm{SL}_r(\mathbf{Q})$, and $\Gamma$ is a finitely generated Zariski-dense subgroup of $\mathrm{SL}_r(\mathbf{Z})$,
then $X \cap \Gamma$ is exponentially small in $\Gamma$, and by saying that the set of mapping classes
(in a fixed mapping class group $\Gamma$ of genus $g \ge 1$) for which the corresponding manifold
obtained by Heegaard splitting has positive first rational Betti number is exponentially
small.

The first inkling of the large sieve in non-abelian discrete groups is found in appli-
cations of the qualitative argument above by Dunfield-Thurston [7] and Rivin [44, 45]
in geometric contexts (the second paper [45] of Rivin was the first to obtain exponential
decay, though its publication was delayed by a journal with an overly long backlog; we thank
I. Rivin for clarifying the priority in publication here). We illustrate further the large
sieve with an example from the second, and then discuss briefly two other applications.

Example 5.12 (Pseudo-Anosov elements of the mapping class group). Let $g \ge 1$ be
given and let $\Gamma$ be the mapping class group of $\Sigma_g$. Thurston's celebrated theory classifies
the elements $\gamma \in \Gamma$ as (1) reducible; (2) finite-order; or (3) pseudo-Anosov. To quantify
the feeling that "most" elements are of the third type, Rivin used a criterion based on
the action of $\gamma$ on $H_1(\Sigma_g, \mathbf{Z})$, which says that if (but not only if) the characteristic
polynomial of this action is irreducible, and satisfies further easy conditions,
then $\gamma$ is pseudo-Anosov. One then notes that if this polynomial is reducible, then so is its reduction
modulo any prime, so $\pi_p(\gamma)$ is not in the subset $\Omega_p$ of elements of $\mathrm{Sp}_{2g}(\mathbf{F}_p)$ for which the
characteristic polynomial is irreducible. A computation that goes back to Chavdarov [HI
§3] shows that, for some $\delta > 0$, we have

$\frac{|\Omega_p|}{|\mathrm{Sp}_{2g}(\mathbf{F}_p)|} \ge \delta > 0$

for all $p$, and hence the large sieve applies. A simple further argument deals with the
other necessary conditions in the pseudo-Anosov criterion, and one concludes that the
set of non-pseudo-Anosov elements is exponentially small in $\Gamma$.

It should be said, however, that this proof is to some extent unsatisfactory, because it
doesn't use the deeper structural and dynamical properties of pseudo-Anosov elements.
For instance, using the action on homology means that one cannot argue similarly for
subgroups of $\Gamma$ for which the action on homology is small, especially subgroups of the
Torelli group, which is defined precisely as the kernel of the homomorphism

$\Gamma \to \mathrm{Sp}_{2g}(\mathbf{Z})$

giving this action.

However, Maher [34, 35] has shown, using more geometric methods, that non-pseudo-
Anosov elements are exponentially small in any subgroup of $\Gamma$, except those for which
this property is false for obvious reasons, and his work applies in particular to the Torelli
subgroup.

On the other hand, Lubotzky-Meiri [31] and Malestein-Souto [36] (independently)
have recently found proofs that non-pseudo-Anosov elements are exponentially small in
the Torelli group using ideas similar to those above.

Example 5.13 (Powers in linear groups). In [30], Lubotzky and Meiri prove the following
statement using the large sieve. The reader should note that this is, on the face of it, a
purely algebraic property of finitely generated linear groups.

Theorem 5.14 (Lubotzky-Meiri). Let $\Gamma$ be a finitely generated subgroup of $\mathrm{GL}_r(\mathbf{C})$ for
some $r \ge 2$. If $\Gamma$ is not virtually solvable (see footnote 1), then the set $X$ of proper powers, i.e., the set
of those $g \in \Gamma$ such that there exists $k \ge 2$ and $h \in \Gamma$ with $g = h^k$, is exponentially small
in $\Gamma$.

This strengthens considerably some earlier work of Hrushovski, Kropholler, Lubotzky
and Shalev [20]. The proof is also very instructive, in particular by showing how sieve
should be considered as a tool among others: here, one can use the large sieve to control
elements which are $k$-th powers for a fixed $k \ge 2$, but taking the union over all $k \ge 2$
cannot be done with sieve alone. So Lubotzky and Meiri use other tools to deal with
large values of $k$, in that case based on ideas related to the work of Lubotzky, Mozes and
Raghunathan comparing archimedean and word-length metrics [33].

Example 5.15 (Typical Galois groups of characteristic polynomials). Our last example
has been studied by Rivin [44], Jouve-Kowalski-Zywina [21], Gorodnik-Nevo [16] and
most recently Lubotzky-Rosenzweig [32], who were the first to explicitly consider the
case of sparse subgroups. However, the underlying idea of probabilistic Galois theory
goes back to versions of Hilbert's irreducibility theorem, and especially to Gallagher's
introduction of the large sieve in this context [13]. (There are also relations with works
of Prasad and Rapinchuk.)

Footnote 1. I.e., there is no finite-index solvable subgroup of $\Gamma$.

In the (most general) version of Lubotzky-Rosenzweig, one considers a finitely gener-
ated field $K \subset \mathbf{C}$ and a finitely generated subgroup $\Gamma \subset \mathrm{GL}_r(K)$ for some $r \ge 2$. The
basic question is: what is the "typical" behavior of the splitting field of the characteristic
polynomial $\det(T - g) \in K[T]$ for some element $g \in \Gamma$?

This can be studied using the large sieve, as we explain in the simplest case when
$\Gamma \subset \mathrm{SL}_r(\mathbf{Z})$. Let $G$ be the Zariski-closure of $\Gamma$, and assume that $G$ is connected and split
over $\mathbf{Q}$, for instance $G = \mathrm{SL}_r$. Let $W$ be the Weyl group of $G$: this will turn out to be
the typical Galois group in this case.

To see this, the first ingredient is the existence, for any prime $p$ large enough (such
that $G$ can be reduced modulo $p$), of a certain map

$\varphi_p : G(\mathbf{F}_p)_r^{\sharp} \to W^{\sharp}$

going back to Carter and Steinberg, where $\sharp$ denotes the set of conjugacy classes of
a finite group and the subscript $r$ restricts to regular semisimple elements in the finite
group $G(\mathbf{F}_p)$.

This map is used to detect elements in the Galois groups of elements in $\Gamma$ in the
following way. First, for $g \in \mathrm{SL}_r(\mathbf{Z})$, let $P_g$ be the characteristic polynomial and let $K_g$
be its splitting field, $\mathrm{Gal}_g$ its Galois group over $\mathbf{Q}$. The point is that, if $g$ is a regular
semisimple element of $\Gamma$, it is shown in [21] that there exists an injective homomorphism

$j_g : \mathrm{Gal}_g \to W$,

canonical up to conjugation, such that if $p$ is any prime unramified in $K_g$, the Frobenius
conjugacy class at $p$ maps under $j_g$ to the conjugacy class $\varphi_p(\pi_p(g)) \in W^{\sharp}$. Thus one
can detect whether the image of $\mathrm{Gal}_g$ in $W$ intersects various conjugacy classes by seeing
where the reduction modulo $p$ of $g$ lies with respect to $\varphi_p$. As it turns out, the image of
$\varphi_p$ becomes equidistributed among the conjugacy classes in $W$ as $p$ becomes large. Using
this, it is not too hard to show that if $\alpha \in W^{\sharp}$ is a given conjugacy class and if $\Omega_p$ denotes
the set of $g \in G(\mathbf{F}_p)$ such that $\varphi_p(g) = \alpha$, then these sets satisfy a large sieve density
assumption

$\frac{|\Omega_p|}{|G(\mathbf{F}_p)|} \ge \delta_{\alpha} > 0$

for some $\delta_{\alpha} > 0$ and all $p$ large enough. It follows by the large sieve that the probability
that the element $\gamma_n$ at the $n$-th step of a random walk on $\Gamma$ has Galois group such that
$j_g(\mathrm{Gal}_g) \cap \alpha = \emptyset$ is exponentially small. This holds for all the finitely many classes in $W$,
and a well-known lemma of Jordan (see footnote 2) allows us to conclude that the set of $g \in \Gamma$ where $j_g$
is not onto is exponentially small.
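The lemma of Jordan used in the last step (a proper subgroup of a finite group misses at least one conjugacy class) can be checked exhaustively in a small case. The following is a minimal sketch for the symmetric group $S_4$, which is the Weyl group of $\mathrm{SL}_4$; all function names are hypothetical. It relies on the fact that every subgroup of $S_4$ can be generated by at most two elements, so closing all pairs of elements under multiplication enumerates every subgroup.

```python
from itertools import permutations


def compose(a, b):
    """Return the permutation a∘b (apply b first, then a)."""
    return tuple(a[i] for i in b)


def inverse(a):
    """Return the inverse permutation of a."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Closure of the generators under multiplication (a subgroup, by finiteness)."""
    group = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)


def conjugacy_classes(group):
    """Return the set of conjugacy classes of the group, each as a frozenset."""
    return {frozenset(compose(compose(h, g), inverse(h)) for h in group) for g in group}


def misses_a_class(subgroup, classes):
    """True if the subgroup is disjoint from at least one conjugacy class."""
    return any(subgroup.isdisjoint(c) for c in classes)


S4 = list(permutations(range(4)))
IDENTITY = tuple(range(4))
CLASSES = conjugacy_classes(S4)
SUBGROUPS = {generated_subgroup((a, b), IDENTITY) for a in S4 for b in S4}
PROPER = [H for H in SUBGROUPS if len(H) < len(S4)]

if __name__ == "__main__":
    print(len(CLASSES), len(SUBGROUPS), all(misses_a_class(H, CLASSES) for H in PROPER))
```

In the Galois-theoretic application, this is what allows one to pass from "the image of $j_g$ meets every conjugacy class of $W$" to "the image of $j_g$ is all of $W$".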

The general case treated by Lubotzky-Rosenzweig is quite a bit more involved. In 
particular, new phenomena appear when G is not connected, and the different cosets of 
the connected component of the identity then usually have different typical Galois groups. 
We refer to their paper for details. 

6. Problems and questions 

We discuss here a few questions and problems, selected to a large extent according to 
the author's own interests and bias. 

(1) [Effective results] A striking aspect of the results we have described is how little 
they use the many refinements and developments of sieve theory, as described 
in ^ for small sieves, and in [25] for the large sieve. This is due to the almost 



Footnote 2. In a finite group $G$, there is no proper subgroup $H$ such that $H \cap \alpha \ne \emptyset$ for all conjugacy classes $\alpha$
in $G$.
complete absence of explicit forms of the Expansion Assumption for sparse groups,
from which it follows that one cannot, for instance, give a numerical value of the
integer $\omega$ guaranteed to exist in Corollary 4.8 (recall that in classical sieves, the
current state of the art is very refined indeed: one knows, for instance, that the
number of primes $p \le x$ such that $p + 2$ has at most two prime factors is of the
right order of magnitude). In fact, when implementing the combinatorial counting
methods (either word-length or random walks), there is no known explicit sieve
statement, as far as the author knows (whereas a few explicit bounds do exist
for archimedean balls, based on spectral or ergodic methods, see, e.g., the works
of Kontorovich [22], Kontorovich-Oh [21], Nevo-Sarnak [27], Liu-Sarnak [29],
and Gorodnik-Nevo [T3], or for random walks in a few arithmetic groups [25]).
It seems clear that the current proofs of expansion for sparse groups, although
they are effective, would lead to dreadful bounds on a suitable $\omega$ (see [27] for a
numerical upper bound on the spectral radius for Cayley graphs of Zariski-dense
subgroups of $\mathrm{SL}_2(\mathbf{Z})$ modulo primes, which suggests, e.g., that one could not get
better than $\omega$ of size at least 2^*° or so for the product of coordinates function on
the Lubotzky group...).
(2) [Average expansion?] One possibility suggested by the classical Bombieri-Vino- 
gradov Theorem is to attempt a proof of expansion "on average" for the relevant 
Cayley graphs: for many applications, it would be sufficient to prove estimates 
for quantities like 



and such estimates could conceivably be provable without resorting to individual 
estimates for each qj. They could also, optimistically, be of better quality than 
what is true for individual /. (Such a property is known for classical sieve, by 
work of Fouvry, Bombieri, Friedlander and Iwaniec). 

(3) [Combinatorial balls] It would be very interesting to have equidistribution and
sieve results using truncations based on word-length balls, without resorting to
random walks. Here, the hope is that one might not need to compute the asymp-
totics of the size of the combinatorial balls, since one is only interested in relative
proportions of elements in a ball mapping to a given $g \in \Gamma_p$.

(4) [Reverse power] This question is related to (1): at least in some cases, one has 
very convincing conjectures for the counting function of primes arising from small 
sieve in orbits (see, e.g., [IHIIII])- Suppose one assumes such conjectures. What 
does this imply for prime numbers? In other words, can one exploit information 
on primes represented using the sieve in orbits to derive other properties of prime 
numbers? Here the reference to keep in mind is the result of Gallagher (see [12] 
and the generalization in [2H]) that shows that uniform versions of the Hardy-
Littlewood $k$-tuples conjecture imply that the number of primes $p \le x$ in intervals
of length $\lambda \log x$, for fixed $\lambda > 0$, is asymptotically Poisson-distributed.